Calibrated Nonparametric Scan Statistics for Anomalous Pattern Detection in Graphs

Abstract

We propose a new approach, the calibrated nonparametric scan statistic (CNSS), for more accurate detection of anomalous patterns in large-scale, real-world graphs. Scan statistics identify connected subgraphs that are interesting or unexpected through maximization of a likelihood ratio statistic; in particular, nonparametric scan statistics (NPSSs) identify subgraphs with a higher than expected proportion of individually significant nodes. However, we show that recently proposed NPSS methods are miscalibrated, failing to account for the maximization of the statistic over the multiplicity of subgraphs. This results in both reduced detection power for subtle signals, and low precision of the detected subgraph even for stronger signals. Thus we develop a new statistical approach to recalibrate NPSSs, correctly adjusting for multiple hypothesis testing and taking the underlying graph structure into account. While the recalibration, based on randomization testing, is computationally expensive, we propose both an efficient (approximate) algorithm and new, closed-form lower bounds (on the expected maximum proportion of significant nodes for subgraphs of a given size, under the null hypothesis of no anomalous patterns). These advances, along with the integration of recent core-tree decomposition methods, enable CNSS to scale to large real-world graphs, with substantial improvement in the accuracy of detected subgraphs. Extensive experiments on both semi-synthetic and real-world datasets are demonstrated to validate the effectiveness of our proposed methods, in comparison with state-of-the-art counterparts.
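The miscalibration described in the abstract can be illustrated with a small randomization test. The sketch below is not the authors' algorithm: it assumes, purely for illustration, a path graph in which every connected subgraph of size k is a contiguous window of k nodes, and it draws node p-values uniformly at random to represent the null hypothesis. The function names and parameters are hypothetical. It estimates the expected maximum proportion of significant nodes over all windows of a given size, which is the kind of baseline a recalibrated statistic would compare against instead of the nominal significance level alpha.

```python
import random


def max_window_proportion(pvalues, k, alpha):
    """Largest fraction of significant nodes (p <= alpha) in any length-k window.

    Assumption: the graph is a path, so connected subgraphs of size k
    are exactly the contiguous windows of k nodes.
    """
    flags = [1 if p <= alpha else 0 for p in pvalues]
    count = sum(flags[:k])          # significant nodes in the first window
    best = count
    for i in range(k, len(flags)):  # slide the window one node at a time
        count += flags[i] - flags[i - k]
        best = max(best, count)
    return best / k


def expected_null_maximum(n, k, alpha, replicates=200, seed=0):
    """Randomization estimate of the expected maximum proportion under the null.

    Each replicate draws n independent uniform p-values (no anomalous pattern)
    and records the maximum proportion over all windows of size k.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replicates):
        pvalues = [rng.random() for _ in range(n)]
        total += max_window_proportion(pvalues, k, alpha)
    return total / replicates


if __name__ == "__main__":
    n, k, alpha = 1000, 20, 0.05
    baseline = expected_null_maximum(n, k, alpha)
    print(f"nominal alpha = {alpha}, estimated null maximum = {baseline:.3f}")
```

Because the maximum is taken over many windows, the estimated baseline sits above the nominal alpha even when no anomaly is present; comparing an observed proportion to alpha alone would therefore overstate its significance. On general graphs the set of connected subgraphs is far larger than the set of windows, which is why the abstract describes the exact randomization approach as computationally expensive.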

Related resources

Scan Statistics for the Online Detection of Locally Anomalous Subgraphs

Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Statistics, The University of New Mexico, Albuquerque, New Mexico.

Fast generalized subset scan for anomalous pattern detection

We propose Fast Generalized Subset Scan (FGSS), a new method for detecting anomalous patterns in general categorical data sets. We frame the pattern detection problem as a search over subsets of data records and attributes, maximizing a nonparametric scan statistic over all such subsets. We prove that the nonparametric scan statistics possess a novel property that allows for efficient optimizat...

Bayesian Network Scan Statistics for Multivariate Pattern Detection

We review three recently proposed scan statistic methods for multivariate pattern detection. Each method models the relationship between multiple observed and hidden variables using a Bayesian network structure, drawing inferences about the underlying pattern type and the affected subset of the data. We first discuss the multivariate Bayesian scan statistic (MBSS) proposed by Neill and Cooper (...

Scan Statistics on Enron Graphs

We introduce a theory of scan statistics on graphs and apply the ideas to the problem of anomaly detection in a time series of Enron email graphs. Corresponding author: Carey E. Priebe

Scan Statistics for Interstate Alliance Graphs

In this paper we discuss work on graphs defined in terms of alliances between countries. We will use scan statistics to investigate years in which there are an unusual number of agreements, not just between one country and its allies, but amongst the allies themselves. This is related to work on email “chatter” discussed in Priebe et al. [2006]. In this section we will lay out the basic graph t...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2022

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v36i4.20339